Supervised Classification of Genetic Sequences for Population Analysis
Abstract
We used a Support Vector Machine (SVM) algorithm to analyze biological populations through supervised classification of sequence data. We describe open-source software that implements the method, and we apply both the method and the program to environmentally challenged populations of the estuarine fish Fundulus heteroclitus. Specifically, we investigate whether the genetic composition (DNA sequence) of a particular detoxification locus predicts the assignment of fish to chemically contaminated versus clean estuaries. The analysis uses an SVM algorithm to assign individual fish, characterized by their allelic composition, to a toxic-resistant or non-resistant group. We employed the classification error of this assignment as a measure of population similarity. The results validate the proposed method by providing supporting evidence for the previously suggested role of the AHR1 (aryl hydrocarbon receptor) locus in the toxic response pathway of Fundulus heteroclitus.
Corresponding author mailing address: San Francisco State University, 1600 Holloway Avenue, San Francisco, CA 94132.
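The idea of using classification error as a similarity measure can be illustrated with a minimal sketch. This is not the authors' software: it assumes a one-hot encoding of DNA sequences, a linear SVM trained by plain batch subgradient descent on the hinge loss, and a single holdout split instead of a full cross-validation. The sequences, labels, and function names are hypothetical and chosen only for illustration.

```python
ALPHABET = "ACGT"

def one_hot(seq):
    """Encode a DNA sequence as a flat one-hot vector (4 entries per site)."""
    vec = []
    for base in seq:
        vec.extend(1.0 if base == b else 0.0 for b in ALPHABET)
    return vec

def score(w, x):
    """Linear decision value w . x (no separate bias: every one-hot vector
    has the same number of ones, so a bias can be absorbed into w)."""
    return sum(wj * xj for wj, xj in zip(w, x))

def train_linear_svm(X, y, lam=0.01, eta=0.1, steps=500):
    """Batch subgradient descent on the L2-regularized hinge loss.
    Labels must be +1 or -1. Hyperparameters are illustrative defaults."""
    n = len(X)
    w = [0.0] * len(X[0])
    for _ in range(steps):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            if yi * score(w, xi) < 1.0:  # sample violates the margin
                for j, xj in enumerate(xi):
                    grad[j] += yi * xj
        w = [wj * (1.0 - eta * lam) + eta * gj / n for wj, gj in zip(w, grad)]
    return w

def predict(w, x):
    """Assign +1 (e.g. resistant) if the score is positive, else -1."""
    return 1 if score(w, x) > 0.0 else -1

def holdout_error(train, test):
    """Train on labeled sequences, return the error rate on held-out ones."""
    X = [one_hot(s) for s, _ in train]
    y = [label for _, label in train]
    w = train_linear_svm(X, y)
    wrong = sum(predict(w, one_hot(s)) != label for s, label in test)
    return wrong / len(test)

# Hypothetical toy data: the two populations differ at the first site.
distinct_train = [("ACGT", 1), ("ACGA", 1), ("TCGT", -1), ("TCGA", -1)]
distinct_test = [("ACCT", 1), ("TCCT", -1)]

# Hypothetical toy data: the two populations carry identical sequences.
similar_train = [("ACGT", 1), ("ACGA", 1), ("ACGT", -1), ("ACGA", -1)]
similar_test = [("ACCT", 1), ("ACCT", -1)]

print(holdout_error(distinct_train, distinct_test))  # 0.0
print(holdout_error(similar_train, similar_test))    # 0.5
```

In this toy setting, an error near 0 indicates that the locus separates the two groups, while an error near 0.5 (chance level for two balanced classes) indicates that the groups are indistinguishable at that locus. A real analysis would use many more individuals and a cross-validated error estimate.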
Publication date: 2005